Smart Paper Notes

Kernel Inversed Pyramidal Resizing Network for Efficient Pavement Distress Recognition

Rong Qin , Luwen Huangfu , Devon Hood , James Ma , Sheng Huang

Categories: Computer Vision | Artificial Intelligence

2022-12-04

Pavement Distress Recognition (PDR) is an important step in pavement inspection and can be powered by image-based automation to expedite the process and reduce labor costs. Pavement images are often high-resolution, with a low ratio of distressed to non-distressed areas. Advanced approaches leverage these properties by dividing images into patches and exploring discriminative features in the scale space. However, these approaches usually suffer from information loss during image resizing and from low efficiency due to complex learning frameworks. In this paper, we propose a novel and efficient method for PDR. A light network named the Kernel Inversed Pyramidal Resizing Network (KIPRN) is introduced for image resizing, and can be flexibly plugged into the image classification network as a pre-network to exploit resolution and scale information. In KIPRN, pyramidal convolution and kernel inversed convolution are specifically designed to mine discriminative information across different feature granularities and scales. The mined information is passed along to the resized images to yield an informative image pyramid that assists the image classification network for PDR. We applied our method to three well-known Convolutional Neural Networks (CNNs) and conducted an evaluation on a large-scale pavement image dataset named CQU-BPDD. Extensive results demonstrate that KIPRN can generally improve the pavement distress recognition of these CNN models, and show that the simple combination of KIPRN and EfficientNet-B3 significantly outperforms the state-of-the-art patch-based method in both performance and efficiency.


PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification

Wenhao Tang , Sheng Huang , Xiaoxian Zhang , Luwen Huangfu

Category: Computer Vision

2022-09-21

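Automatic pavement distress classification helps improve the efficiency of pavement maintenance and reduce the cost of labor and resources. A recently influential branch of this task divides the pavement image into patches and addresses these issues from the perspective of multi-instance learning. However, these methods neglect the correlation between patches and suffer from low efficiency in model optimization and inference. Meanwhile, Swin Transformer is able to address both of these issues with its unique strengths. Building on Swin Transformer, we present a vision Transformer named Pavement Image Classification Transformer (PicT). To better exploit the discriminative information of pavement images at the patch level, a Patch Labeling Teacher is proposed, which leverages a teacher model to dynamically generate pseudo labels of patches from image labels during each iteration and guides the model to learn the discriminative features of patches. The broad classification head of Swin Transformer may dilute the discriminative features of distressed patches in the feature aggregation step, because the distressed area of a pavement image is small. To overcome this drawback, we present a Patch Refiner that clusters patches into different groups and selects only the highest distress-risk group to yield a slim head for the final image classification. We evaluate our method on CQU-BPDD. Extensive results show that PicT outperforms the second-best model by a large margin of +2.4% in P@R on the detection task, +3.9% in F1 on the recognition task, and 1.8x in throughput, while enjoying 7x faster training speed using the same computing resources. Our code and models have been released at https://github.com/dearcaat/pict.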


An Iteratively Optimized Patch Label Inference Network for Automatic Pavement Distress Detection

Wenhao Tang , Sheng Huang , Qiming Zhao , Ren Li , Luwen Huangfu

Category: Computer Vision

2020-05-27

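We present a novel deep learning framework named the Iteratively Optimized Patch Label Inference Network (IOPLIN) for automatically detecting various pavement distresses that are not limited to specific ones, such as cracks and potholes. IOPLIN can be iteratively trained via the Expectation-Maximization Inspired Patch Label Distillation (EMIPLD) strategy, and accomplishes this task well by inferring the labels of patches from the pavement images. IOPLIN enjoys many desirable properties over state-of-the-art single-branch CNN models such as GoogLeNet and EfficientNet. It is able to handle images of different resolutions and to make full use of image information, particularly for high-resolution images, since IOPLIN extracts visual features from unrevised image patches instead of the resized entire image. Moreover, it can roughly localize the pavement distress without using any prior localization information in the training phase. To better evaluate the effectiveness of our method in practice, we construct a large-scale Bituminous Pavement Disease Detection dataset named CQU-BPDD, consisting of 60,059 high-resolution pavement images acquired from different areas at different times. Extensive results on this dataset demonstrate the superiority of IOPLIN over state-of-the-art image classification approaches in automatic pavement distress detection. The source code of IOPLIN is released at https://github.com/DearCaat/ioplin, and the CQU-BPDD dataset can be accessed at https://dearcaat.github.io/CQU-BPDD/.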
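The patch-based idea described above can be illustrated with a minimal sketch. It is not the IOPLIN implementation: the function names, the choice to drop partial border tiles, and the max-based aggregation of patch scores into an image score are all assumptions made for illustration.

```python
from typing import List

# A grayscale image as rows of pixel values (illustrative representation).
Image = List[List[float]]


def split_into_patches(image: Image, patch: int) -> List[Image]:
    """Split an image into non-overlapping patch x patch tiles, row-major.

    Assumption: rows/columns that do not fill a whole tile are dropped.
    """
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tiles.append([row[left:left + patch]
                          for row in image[top:top + patch]])
    return tiles


def image_score(patch_scores: List[float]) -> float:
    """Hypothetical aggregation: an image is as distressed as its worst patch."""
    return max(patch_scores)
```

Because the tiles are cut from the original pixels rather than from a downsized copy, small distressed regions keep their full resolution, which is the property the abstract emphasizes.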


Robust Split Federated Learning for U-shaped Medical Image Networks

Ziyuan Yang , Yingyu Chen , Huijie Huangfu , Maosong Ran , Hui Wang , Xiaoxiao Li , Yi Zhang

Category: Computer Vision

2022-12-13

U-shaped networks are widely used in various medical image tasks, such as segmentation, restoration and reconstruction, but most of them usually rely on centralized learning and thus ignore privacy issues. To address the privacy concerns, federated learning (FL) and split learning (SL) have attracted increasing attention. However, it is hard for both FL and SL to balance the local computational cost, model privacy and parallel training simultaneously. To achieve this goal, in this paper, we propose Robust Split Federated Learning (RoS-FL) for U-shaped medical image networks, which is a novel hybrid learning paradigm of FL and SL. Previous works cannot preserve the data privacy, including the input, model parameters, label and output simultaneously. To effectively deal with all of them, we design a novel splitting method for U-shaped medical image networks, which splits the network into three parts hosted by different parties. Besides, the distributed learning methods usually suffer from a drift between local and global models caused by data heterogeneity. Based on this consideration, we propose a dynamic weight correction strategy (DWCS) to stabilize the training process and avoid model drift. Specifically, a weight correction loss is designed to quantify the drift between the models from two adjacent communication rounds. By minimizing this loss, a correction model is obtained. Then we treat the weighted sum of correction model and final round models as the result. The effectiveness of the proposed RoS-FL is supported by extensive experimental results on different tasks. Related codes will be released at https://github.com/Zi-YuanYang/RoS-FL.
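The final step of the weight correction strategy, a weighted sum of a correction model and a final-round model, can be sketched as follows. This is a hedged illustration and not the RoS-FL code: the abstract does not say how the weights are chosen or how drift is measured, so the mixing weight `alpha`, the flat-list parameter layout, the use of a single final-round model, and the squared L2 drift measure are all assumptions.

```python
from typing import Dict, List

# Model parameters as named flat lists of floats (illustrative layout).
Params = Dict[str, List[float]]


def blend_models(correction: Params, final_round: Params, alpha: float) -> Params:
    """Return alpha * correction + (1 - alpha) * final_round, per parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if correction.keys() != final_round.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [alpha * c + (1.0 - alpha) * f
               for c, f in zip(correction[name], final_round[name])]
        for name in correction
    }


def drift(previous: Params, current: Params) -> float:
    """Squared L2 distance between two rounds (an assumed drift measure)."""
    return sum((p - c) ** 2
               for name in previous
               for p, c in zip(previous[name], current[name]))
```

For example, `blend_models({"w": [1.0, 3.0]}, {"w": [3.0, 1.0]}, 0.5)` averages the two parameter vectors, and `drift` returns zero when two adjacent rounds agree exactly.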


WAIR-D: Wireless AI Research Dataset

Yourui Huangfu , Jian Wang , Shengchen Dai , Rong Li , Jun Wang , Chongwen Huang , Zhaoyang Zhang

Category: Machine Learning

2022-12-05

It is common sense that datasets with high-quality data samples play an important role in artificial intelligence (AI), machine learning (ML) and related studies. However, although AI/ML was introduced into wireless research long ago, few datasets are commonly used in the research community. Without a common dataset, AI-based methods proposed for wireless systems are hard to compare with traditional baselines or even with each other. Existing wireless AI research usually relies on datasets generated from statistical models or from ray-tracing simulations with limited environments. Statistical data hinder the trained AI models from further fine-tuning for a specific scenario, and ray-tracing data with limited environments lower the generalization capability of the trained AI models. In this paper, we present the Wireless AI Research Dataset (WAIR-D), which consists of two scenarios. Scenario 1 contains 10,000 environments with sparsely dropped user equipments (UEs), and Scenario 2 contains 100 environments with densely dropped UEs. The environments are randomly picked from more than 40 cities on the real-world map. The large volume of the data guarantees that the trained AI models enjoy good generalization capability, while fine-tuning can be easily carried out on a specific chosen environment. Moreover, both the wireless channels and the corresponding environmental information are provided in WAIR-D, so that extra-information-aided communication mechanisms can be designed and evaluated. WAIR-D provides researchers with benchmarks to compare their different designs or reproduce the results of others. In this paper, we show the detailed construction of this dataset and examples of using it.
